This project implements P3104: Bit Permutations. Ironically, while this project has "C++26" in its name, the proposal didn't make it into C++26, but will almost certainly be in C++29. See cplusplus/papers#1768 for its status.
There is a single-header reference implementation of the functions
bit_reversebit_repeatnext_bit_permutationprev_bit_permutationbit_compressr=bit_compressbit_expandr=bit_expandbit_compresslbit_expandl
There are also implementations of existing <bit> functions for the purpose of testing.
The standard library functions don't support _BitInt or 128-bit integers, so it was necessary
to circumvent them:
popcountcountl_zerocountl_onecountr_zerocountr_one
There are some additional functions from other proposals:
clmulclmul_wideshlshr
All functions are located in namespace cxx26bp.
This implementation aims to provide the fastest possible library implementation for each of these functions, using any possible hardware support. This project is portable, and tries to support
- Architectures: x86, ARM
- Compilers: MSVC, GCC, Clang
template<class T>
constexpr T bit_reverse(T x) noexcept;Constraints:
T is an unsigned integer type.
Returns:
x with the order of bits reversed.
That is, the least significant bit of x
is the most significant bit of the result,
and vice versa.
Mathematically, given the width T,
the
On ARM, this function has the same effect as the rbit instruction.
template<class T>
constexpr T bit_repeat(T x, int l);Constraints:
T is an unsigned integer type.
Preconditions:
l > 0.
Returns:
A bit pattern stored in x of length l,
repeated as many times as fits into the result.
Any extraneous bits are truncated.
Mathematically, the
// also known as bit_compressr
template<class T>
constexpr T bit_compress(T x, T m) noexcept;Constraints:
T is an unsigned integer type.
Returns:
Each bit of x where the corresponding bit in m is 1,
contiguously packed into the least significant bits of the result,
with their relative order preserved.
Any more significant bits of the result are 0.
On x86, this function has the same effect as the pext instruction.
template<class T>
constexpr T bit_compressl(T x, T m) noexcept;Effects:
Equivalent to bit_reverse(bit_compressr(bit_reverse(x), bit_reverse(l))).
// also known as bit_expandr
template<class T>
constexpr T bit_expand(T x, T m) noexcept;Constraints:
T is an unsigned integer type.
Returns:
The least significant bits of x at the bit positions where m has a 1 bit,
with their relative order preserved.
Any other bits of the result are 0.
On x86, this functions has the same effect as the pdep instruction.
template<class T>
constexpr T bit_expandl(T x, T m) noexcept;Effects:
Equivalent to bit_reverse(bit_expandr(bit_reverse(x), bit_reverse(l))).
template<class T, class S>
constexpr T shl(T x, S s) noexcept;Constraints:
Each of T and S is a signed or unsigned integer type.
Returns:
template<class T, class S>
constexpr T shr(T x, S s) noexcept;Constraints:
Each of T and S is a signed or unsigned integer type.
Returns:
template<class T>
struct wide_result {
T low_bits;
T high_bits;
// ...
};
template<class T>
constexpr T clmul_wide(T x, T y) noexcept;Constraints:
T is an unsigned integer type.
Returns:
The carry-less product (aka XOR product) of x and y.
The low bits of this product are stored in low_bits
of the wide_result object,
and the high bits are stored in high_bits.
template<class T>
constexpr T clmul(T x, T y) noexcept;Effects:
Equivalent to clmul_wide(x, y).low_bits.